Method and system for incremental backup copying of data.



Backup copying at a first selected point in time
is performed in a data processing system on a
storage subsystem concurrent with an application
execution by first suspending application execution
only long enough to form a logical-to-physical ad-
dress concordance, and thereafter physically back-
ing up the datasets on a scheduled or opportunistic
basis. An indication of each update to a selected
portion of the designated datasets occurring after the
first selected point in time is stored and application
initiated updates to uncopied designated datasets
are first buffered. Thereafter, sidefiles are made of
the affected datasets, or portions thereof, the up-
dates are then written through to the storage sub-
system, and the sidefiles written to an alternate
storage location in backup copy order, as controlled
by the address concordance. At a subsequent point
in time only those portions of the designated data-
sets which have been updated after the first selected
point in time are copied, utilizing an identical tech-
nique.

This invention relates in general to methods
and systems for maintaining continued availability 
of datasets in external storage associated with ac- 
cessing data processing systems, and in particular 
the present invention relates to backup copying of 
records in external storage concurrent with a dra- 
matically shortened suspension of data processing 
system application execution occasioned by such 
copying. Still more particularly, the present inven- 
tion relates to a method and system for incre- 
mental backup copying of records in a data pro- 
cessing system which minimizes still further the 
suspension of data processing system application 
execution during such copying. 

The present application is related to European 
Patent Application No. PCT/EP92/02127 which is 
hereby incorporated herein by reference thereto. 

A modern data processing system must be 
prepared to recover, not only from corruptions of 
stored data which occur as a result of noise bursts, 
software bugs, media defects, and write path er- 
rors, but also from global events, such as data 
processing system power failure. The most com- 
mon technique of ensuring the continued availabil- 
ity of data within a data processing system is to 
create one or more copies of selected datasets 
within a data processing system and store those 
copies in a nonvolatile environment. This so-called 
"backup" process occurs within state-of-the-art ex- 
ternal storage systems in modern data processing 
systems. 

Backup policies are implemented as a matter 
of scheduling. Backup policies have a space and 
time dimension which is exemplified by a range of 
datasets and by the frequency of backup occur- 
rence. A FULL backup requires the backup of an 
entire range of a dataset, whether individual por- 
tions of that dataset have been updated or not. An 
INCREMENTAL backup copies only that portion of 
the dataset which has been updated since a pre- 
vious backup, either full or incremental. The bac- 
kup copy thus created represents a consistent view 
of the data within the dataset as of the time the 
copy was created. 
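
By way of illustration only, the distinction between
the two policies may be sketched as follows. The
function name tracks_to_copy and the modelling of a
dataset as a collection of track numbers are
assumptions made for this example; they are not
taken from any system described herein.

```python
def tracks_to_copy(policy, all_tracks, updated_since_last_backup):
    """Return, in ascending order, the tracks a backup must copy."""
    if policy == "FULL":
        # A FULL backup copies the entire range, updated or not.
        return sorted(all_tracks)
    if policy == "INCREMENTAL":
        # An INCREMENTAL backup copies only tracks of this dataset
        # that were updated since the previous backup.
        return sorted(set(all_tracks) & set(updated_since_last_backup))
    raise ValueError(f"unknown backup policy: {policy}")
```

In this sketch an update recorded against a track
outside the dataset is simply ignored by the
INCREMENTAL policy.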

Of course, those skilled in the art will appre- 
ciate that as a result of the process described 
above, the higher the backup frequency, the more 
accurately the backup copy will mirror the current 
state of data within a dataset. In view of the large 
volumes of data maintained within a typical state- 
of-the-art data processing system, backing up that
data is not a trivial operation. Thus, the opportunity 
cost of backing up data within a dataset may be 
quite high on a large multiprocessing, multipro- 
gramming facility, relative to other types of pro- 
cessing. 

Applications executed within a data processing
system are typically executed in either a batch
(streamed) or interactive (transactional) mode. In a
batch mode, usually one application at a time ex-
ecutes without interruption. Interactive mode is
characterized by interrupt driven multiplicity of ap-
plications or transactions.

When a data processing system is in the pro-
cess of backing up data in either a streamed or
batch mode system, each process, task or applica-
tion within the data processing system is affected.
That is, the processes supporting streamed or
batch mode operations are suspended for the dura-
tion of the copying. Those skilled in the art will
recognize that this event is typically referred to as
a "backup window." In contrast to batch mode
operations, log based or transaction management
applications are processed in the interactive mode.
Such transaction management applications elimi-
nate the "backup window" by concurrently updat-
ing an on-line dataset and logging the change.
However, this type of backup copying results in a
consistency described as "fuzzy." That is, the
backup copy is not a precise "snapshot" of the
state of a dataset/database at a single point in
time. Rather, a log comprises an event file
requiring further processing against the database.

European Patent Application No. 90307839.2
illustrates backup in a batch mode system utilizing
a modified incremental policy. A modified incre-
mental policy copies only new data or data updates
since the last backup. It should be noted that
execution of applications within the data processing
system is suspended during copying in this sys-
tem.

As described above, to establish a prior point
of consistency in a log based system, it is neces-
sary to "repeat history" by replaying the log from
the last check point over the datasets or database
of interest. The distinction between batch mode
and log based backup is that the backup copy is
consistent and speaks as of the time of its last
recordation, whereas the log and database mode
require further processing in the event of a fault, in
order to exhibit a point in time consistency.
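
The "repeat history" step may be pictured with the
following minimal sketch. The names replay_log,
checkpoint and log, and the modelling of each log
entry as a (track, value) pair, are hypothetical and
serve only to illustrate the idea.

```python
def replay_log(checkpoint, log):
    """Rebuild a consistent state by applying logged updates in order."""
    state = dict(checkpoint)  # work on a copy; the check point is kept intact
    for track, value in log:
        state[track] = value  # later entries overwrite earlier ones
    return state
```

Until such a replay has been carried out, neither the
check point nor the log alone exhibits a point in
time consistency.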

United States Patent No. 4,507,751 exemplifies
a transaction management system wherein all
transactions are recorded on a log on a write-ahead
dataset basis. As described within this patent, a
unit of work is first recorded on the backup me-
dium (log) and then written to its external storage
address.

United States Patent Application Serial No.
07/524,206 teaches the performance of media
maintenance on selected portions of a tracked
cyclic operable magnetic media concurrent with
active access to other portions of the storage me-
dia. The method described therein requires the
phased movement of customer data from a
target track to an alternate track, diversion of all
concurrent access requests to the alternate track or
tracks and the completion of maintenance and
copy back from the alternate to the target track.

Requests and interrupts which occur prior to 
executing track-to-track customer data movement 
result in the restarting of the process. Otherwise, 
requests and interrupts occurring during execution 
of the data movement view a DEVICE BUSY state. 
This typically causes a requeueing of the request.

It should therefore be apparent that a need 
exists for a method and system whereby the maxi- 
mum availability of application execution within a 
data processing system is maintained while creat- 
ing backup copies which exhibit a consistent view 
of data within an associated database, as of a 
specific time. 

It is therefore one object of the present inven- 
tion to provide an improved method and system for 
maintaining continued availability of datasets in ex-
ternal storage associated with accessing data pro-
cessing systems.

It is another object of the present invention to 
provide an improved method and system for bac- 
kup copying of records in external storage concur- 
rent with a dramatically shortened suspension of 
data processing system application execution oc- 
casioned by such copying. 

It is yet another object of the present invention 
to provide an improved method and system for 
incremental backup copying of records in a data 
processing system which minimizes, still further, 
the suspension of data processing system applica- 
tion execution during such copying, as well as the 
actual amount of data which must be backed up. 

The foregoing objects are achieved by the in- 
vention as claimed. 

Backup copying at a first selected point in time 
is performed in a data processing system on a 
storage subsystem concurrent with an application 
execution by first suspending application execution 
only long enough to form a logical-to-physical ad- 
dress concordance, and thereafter physically back- 
ing up the datasets on a scheduled or opportunistic 
basis. An indication of each update to a selected 
portion of the designated datasets occurring after
the first selected point in time is stored and ap- 
plication initiated updates to uncopied designated 
datasets are first buffered. Thereafter, sidefiles are 
made of the affected datasets, or portions thereof, 
the updates are then written through to the storage 
subsystem, and the sidefiles written to an alternate 
storage location in backup copy order, as con- 
trolled by the address concordance. At a subse- 
quent point in time only those portions of the 
designated datasets which have been updated after 
the first selected point in time are copied, utilizing 
an identical technique. 



The novel features believed characteristic of
the invention are set forth in the appended claims.
The invention itself however, as well as a preferred
mode of use, further objects and advantages there-
of, will best be understood by reference to the
following detailed description of an illustrative em-
bodiment when read in conjunction with the accom-
panying drawings, wherein:

Figure 1 depicts a typical multiprocessing, multi-
programming environment according to the prior
art where executing processors and applications
randomly or sequentially access data from ex-
ternal storage;

Figures 2A-2C depict time line illustrations of
the backup window in a batch or streaming
process in the prior art, in a time zero backup
system and in an incremental time zero backup
system, respectively;

Figure 3 illustrates a conceptual flow of an in-
cremental time zero backup copy in accordance
with the method and system of the present
invention;

Figure 4 is a high level logic flowchart illustrat-
ing initialization of an incremental time zero bac-
kup copy in accordance with the method and
system of the present invention; and

Figure 5 is a high level logic flowchart illustrat-
ing incremental backup copying in accordance
with the method and system of the present
invention.

With reference now to the figures and in par-
ticular with reference to Figure 1, there is depicted
a multiprocessing, multiprogramming data process-
ing system according to the prior art. Such sys-
tems typically include a plurality of processors 1
and 3 which access external storage units 21, 23,
25, 27, and 29 over redundant channel
demand/response interfaces 5, 7 and 9.

In the illustrated embodiment of Figure 1, each
processor within the data processing system may
be implemented utilizing an IBM/360 or 370
architected processor type having, as an example,
an IBM MVS operating system. An IBM/360
architected processor is fully described in
U.S. Patent No. 3,400,371. A configuration in which
multiple processors share access to external stor-
age units is set forth in U.S. Patent No. 4,207,609.

The MVS operating system is also described in
IBM Publication GC28-1150, entitled
MVS/Extended Architecture System Programming
Library: System Macros and Facilities, Vol. 1. De-
tails of standard MVS or other operating system
services, such as local lock management, sub-
system invocation by interrupt or monitor, and the
posting and waiting of tasks are omitted. These
operating system services are believed to be well
known to those having skill in this art.

Still referring to Figure 1, as described in U.S.
Patent No. 4,207,609, a processor process may es-
tablish a path to externally stored data in an IBM
System 370 or similar system through an MVS or 
other known operating system by invoking a 
START I/O, transferring control to a channel sub- 
system which reserves a path to the data over 
which transfers are made. Typically, executing ap- 
plications have data dependencies and may briefly 
suspend operations until a fetch or update has 
been completed. During such a transfer, the path is 
locked until the transfer is completed. 

Referring now to Figures 2A-2C, there are de- 
picted time lines illustrating the backup window in a 
batch or streaming process in the prior art, in a 
time zero backup system and in an incremental 
time zero backup system respectively. As illus- 
trated at Figure 2A, multiple backup operations 
have occurred, as indicated at backup windows 41 
and 43. Application processing is typically sus- 
pended or shut down just prior to each backup 
window and this suspension will persist until the 
backup process has been completed. Termination 
of the backup window signifies completion of the 
backup process and commitment. By "completion"
what is meant is that all data that was to have been 
copied was in fact read from the source. By 
"commitment" what is meant is that all data to be 
copied was in fact written to an alternate storage 
location. 

Referring now to Figure 2B, backup windows 
for a time zero backup copy system are depicted. 
As described in detail within European Patent Ap- 
plication No. PCT/EP92/02127, each backup win- 
dow 45 and 47 still requires the suspension or 
termination of application processing; however, the 
suspension or termination occurs only for a very 
short period of time. As described in European 
Patent Application No. PCT/EP92/02127, the time 
zero backup method begins, effectively freezing 
data within the datasets to be backed up at that 
point in time. Thereafter, a bit map is created 
identifying each track within the datasets to be 
backed up and after creation of that bit map, the 
copy is said to be "logically complete." The com- 
mitted state, or "physically complete" state will not 
occur until some time later. However, at the 
"logically complete" point in time, the data is com- 
pletely usable by applications within the data pro- 
cessing system. The time during which application 
processing is suspended in such a system is gen- 
erally in the low sub-second range; however, those 
skilled in the art will appreciate that the amount of 
time required to create a bit map to the data to be 
copied will depend upon the amount of data within 
the datasets. 
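
The relationship between the bit map and the two
completion states may be sketched as follows. This
is an illustrative model only: the function names
are hypothetical, tracks are modelled as indices
into a list, and the clearing of a bit once its
track has been copied is an assumption of the
sketch.

```python
def create_session_bitmap(dataset_tracks, total_tracks):
    """Return one flag per track; True marks a track still to be copied."""
    bitmap = [False] * total_tracks
    for track in dataset_tracks:
        bitmap[track] = True
    # Once the bit map exists the copy is "logically complete".
    return bitmap


def mark_copied(bitmap, track):
    """Clear the flag of a track that has been written to the backup."""
    bitmap[track] = False


def is_physically_complete(bitmap):
    """True once no track of the session remains to be copied."""
    return not any(bitmap)
```

The work done while applications are suspended is
proportional to the number of tracks identified,
which is consistent with the observation that the
suspension time depends upon the amount of data.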

Of course, those skilled in the art will appre-
ciate that if the time zero backup process termi-
nates abnormally between the point of logical com-
pletion and the point of physical completion, the
backup copy is no longer useful and the process
must be restarted. In this respect, the time zero
backup process is vulnerable in a manner very
similar to that of backup systems in the prior art.
That is, all backup operations must be rerun if the
process terminates abnormally prior to completion.

Referring now to Figure 2C, the incremental
time zero backup copying process of the invention
is depicted. As above, an initial backup window 49
exists which requires a temporary suspension or
termination of application processing; however, in a
manner which will be explained in greater detail
herein, updates to the dataset which occur after the
initial backup copy has begun are tracked utilizing
an alternate bit map of the designated dataset.
Thereafter, only those tracks within the designated
dataset which have been altered are copied during
a subsequent incremental copy session. Since the
creation of a bit map identifying those tracks within
the dataset which have been updated since a pre-
vious full copy has been completed occurs during
the update process, application processing need
not be suspended until the next time a full copy is
desired. In this manner, suspension or interruption
of application processing is substantially reduced.

With reference now to Figure 3, there is de-
picted a conceptual flow of the creation of an
incremental time zero backup copy in accordance
with the method and system of the present inven-
tion. As illustrated, an incremental time zero bac-
kup copy of data within a tracked cyclic storage
device 61 may be created. As those skilled in the
art will appreciate, data stored within such a device
is typically organized into records and datasets.
The real address of data within external storage is
generally expressed in terms of Direct Access
Storage Device (DASD) volumes, cylinders and
tracks. The virtual address of such data is generally
couched in terms of base addresses and offsets
and/or extents from such base addresses.

Further, a record may be of the count-key-data
format. A record may occupy one or more units of
real storage. A "dataset" is a logical collection of
multiple records which may be stored on contig-
uous units of real storage or which may be dis-
persed. Therefore, those skilled in the art will ap-
preciate that if backup copies are created at the
dataset level it will be necessary to perform mul-
tiple sorts to form inverted indices into real storage.
For purposes of explanation of this invention, bac-
kup processing will be described as managed both
at the resource manager level within a data pro-
cessing system and at the storage control unit
level.

As described above, each processor typically 
includes an operating system which includes a 
resource manager component. Typically, an IBM
System 370 type processor running under the MVS 
operating system will include a resource manager 
of the data facilities dataset services (DFDSS) type 
which is described in U.S. Patent No. 4,855,907. 
DFDSS is also described in IBM Publication GC26- 
4388, entitled Data Facility Dataset Services: 
User's Guide. Thus, a resource manager 63 is 
utilized in conjunction with a storage control unit 65 
to create an incremental backup copy of desig- 
nated datasets stored within tracked cyclic storage 
device 61.

As will be described below, the backup copy 
process includes an initialization period during 
which datasets are sorted, one or more bit maps 
are created and logical completion of the bit map is 
signaled to the invoking process at the processor. 
The listed or identified datasets are then sorted 
according to access path elements down to DASD 
track granularity. Next, bit maps are constructed 
which correlate the dataset and the access path 
insofar as any one of them is included or excluded 
from a given copy session. Lastly, storage man- 
ager 63 signals logical completion, indicating that 
updates will be processed against the dataset only
after a short delay until such time as physical 
completion occurs. 

Backup copying of designated datasets repre- 
senting a first selected point in time consistency 
may be performed in a data processing system on 
an attached storage subsystem concurrent with 
data processing system application execution by 
first suspending application execution only long 
enough to form a logical-to-physical address con- 
cordance, and thereafter physically backing up the 
datasets on the storage subsystem on a scheduled 
or opportunistic basis. An indication of each update 
to a selected portion of the designated datasets 
which occurs after the first selected point in time is 
stored and application initiated updates to uncopied 
designated datasets are first buffered. Thereafter, 
sidefiles are made of the affected datasets, or 
portions thereof, the updates are then written 
through to the storage subsystem, and the sidefiles 
written to an alternate storage location in backup 
copy order, as controlled by the address concor- 
dance. At a subsequent point in time only those 
portions of the designated datasets which have 
been updated after the first selected point in
time are copied, utilizing an identical technique.

Following initialization, storage manager 63 be- 
gins reading the tracks of data which have been 
requested. While a copy session is active, each 
storage control unit monitors all updates to the 
dataset. If an update is received from another ap- 
plication 67, storage control unit 65 will execute a 
predetermined algorithm to process that update, as 
described below. 



In a time zero backup copy system a deter-
mination is first made as to whether or not the
update attempted by application 67 is for a volume
which is not within the current copy session. If the
volume is not within the current copy session, the
update completes normally. Alternately, if the up-
date is for a volume which is part of the copy
session, the primary session bit map is checked to
see if that track is protected. If the corresponding
bit within the bit map is off, indicating the track is
not currently within a copy session, the update
completes normally. However, if the track is pro-
tected (the corresponding bit within the bit map is
on) the track in question is part of the copy session
and has not as yet been read by the storage
manager 63. In such a case, storage control unit 65
temporarily buffers the update and writes a copy of
the affected track from tracked cyclic storage de-
vice 61 into a memory within storage control unit
65. Thereafter, the update is permitted to complete.

Thus, as illustrated in Figure 3, an update
initiated by application 67 may be processed
through storage control unit 65 to update data at
tracks 3 and 5 within tracked cyclic storage unit 61.
Prior to permitting the update to occur, tracks 3
and 5 are written as sidefiles to a memory within
storage control unit 65 and thereafter, the update is
permitted to complete. The primary bit map is then
altered to indicate that the copies of tracks 3 and 5,
as those tracks existed at the time a backup copy
was requested, are no longer within tracked cyclic
storage device 61 but now reside within a memory
within storage control unit 65.
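
The update handling just described may be sketched,
purely for illustration, as follows. The function
handle_update and its parameters are hypothetical;
the device and the sidefile memory are modelled as
dictionaries keyed by track number, and the
alteration of the primary bit map is modelled, as an
assumption, by turning the protection bit off once
the time zero image has been preserved.

```python
def handle_update(track, data, volume_in_session, protected, device, sidefile):
    """Apply an update, first preserving the time zero image if required."""
    if volume_in_session and protected.get(track, False):
        # The track belongs to the session and has not yet been read:
        # keep its original contents in the sidefile.
        sidefile[track] = device[track]
        # Assumption: the bit is cleared to record that the time zero
        # image now resides in the sidefile rather than on the device.
        protected[track] = False
        outcome = "sidefiled"
    else:
        outcome = "normal"
    device[track] = data  # the update is then permitted to complete
    return outcome
```

Under this model a second update to the same track
completes normally, since the time zero image has
already been preserved and must not be overwritten.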

A merged copy, representing the designated
dataset as of the time a backup copy was re-
quested, is then created at reference numeral 69,
by copying non-updated tracks directly from
tracked cyclic storage device 61 through resource
manager 63, or by indirectly copying those tracks
from tracked cyclic storage device 61 to a tem-
porary host sidefile 71, which may be created
within the expanded memory store of a host pro-
cessor. Additionally, tracks within the dataset which
have been written to sidefiles within a memory in
storage control unit 65 prior to completion of an
update may also be indirectly read from the mem-
ory within storage control unit 65 to the temporary
host sidefile 71. Those skilled in the art will appre-
ciate that in this manner a copy of a designated
dataset may be created from unaltered tracks with-
in tracked cyclic storage device 61, from updated
tracks stored within memory of storage control unit
65 and thereafter transferred to temporary host
sidefile 71, wherein these portions of the des-
ignated dataset may be merged in backup copy
order, utilizing the bit map which was created at
the time the backup copy was initiated.
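
A minimal sketch of such a merge is given below. It
is not the implementation of the cross-referenced
application; the name merge_backup, the use of
dictionaries for the device and the sidefile, and
the representation of backup copy order as a list of
track numbers are all assumptions of the example.

```python
def merge_backup(copy_order, device, sidefile):
    """Assemble the time zero image of a dataset in backup copy order."""
    merged = []
    for track in copy_order:
        if track in sidefile:
            # The track was updated after time zero; its original
            # contents were preserved in the sidefile.
            merged.append((track, sidefile[track]))
        else:
            # The track is unaltered and is read from the device.
            merged.append((track, device[track]))
    return merged
```

For example, with tracks 3 and 5 held in the
sidefile, merge_backup([1, 3, 5], device, sidefile)
takes track 1 from the device and tracks 3 and 5
from the sidefile.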

Referring now to Figure 4, there is depicted a
high level logic flowchart which illustrates the initial- 
ization of a process for creating an incremental 
time zero backup copy, in accordance with the 
method and system of the present invention. As 
illustrated, this process starts at block 81 and 
thereafter passes to block 83 which illustrates the 
beginning of the initialization process. Thereafter, 
the process passes to block 85 which depicts the 
sorting of the datasets by access path, down to 
DASD track granularity. This sorting process will, 
necessarily, resolve an identification of the DASD 
volumes within which the datasets reside and the 
identification of the storage control units to which 
those volumes belong. 

Next, as depicted at block 87, a session iden- 
tification is established between each processor 
and the relevant external storage control units. The 
session identification is preferably unique across all 
storage control units, in order that multiple proces-
sors will not interfere with each other's backup
copy processes. Thereafter, as illustrated at block 
89, a primary session bit map is established which 
may be utilized, as set forth in detail herein and 
within the cross-referenced patent application, to 
indicate whether or not a particular track is part of 
the present copy session. Thereafter, as depicted 
at block 91, the "logically complete" signal is sent 
to the invoking process, indicating that application 
processing may continue; however, slight delays in 
updates will occur until such time as the backup 
copy is physically complete. 
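
The initialization steps of blocks 85 through 91 may
be sketched as follows. The sketch is illustrative
only: initialize_copy_session is a hypothetical
name, each track is modelled as a (control unit,
volume, cylinder, track) tuple, and a random UUID
stands in for whatever scheme an actual system would
use to obtain a unique session identification.

```python
import uuid


def initialize_copy_session(track_addresses):
    """Sort tracks by access path and build the primary session bit map."""
    # Block 85: sort by access path down to track granularity.
    copy_order = sorted(set(track_addresses))
    # The sort also resolves which storage control units are involved.
    control_units = sorted({address[0] for address in copy_order})
    # Block 87: a session identification unique across control units
    # (assumption: a random UUID is unique enough for this sketch).
    session_id = uuid.uuid4().hex
    # Block 89: every track of the session starts out protected.
    primary_bitmap = {address: True for address in copy_order}
    # Block 91: the caller may now be told the copy is logically complete.
    return {
        "session_id": session_id,
        "copy_order": copy_order,
        "control_units": control_units,
        "primary_bitmap": primary_bitmap,
        "logically_complete": True,
    }
```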

With reference now to Figure 5, there is de- 
picted a high level logic flowchart which illustrates 
the incremental backup copying of a dataset in 
accordance with the method and system of the 
present invention. As illustrated, the process begins 
at block 99 and thereafter passes to block 101. 
Block 101 depicts the beginning of the reading of a 
backup copy. The process then passes to block 
103 which illustrates a determination of whether or 
not the backup copy is to be a "FULL" copy or a 
"INCREMENTAL" copy. As described above, a 
FULL copy is a copy of each element within a 
designated dataset, regardless of whether or not 
the data within the dataset has been previously 
altered. An INCREMENTAL copy is a copy which 
only includes those portions of the dataset which 
have been updated or altered since the previous 
backup copy occurred. 

Still referring to block 103, in the event a FULL
copy is to be created, the process passes to block
107 which depicts the establishment of an alternate
session bit map. As will be described in greater
detail herein, an alternate session bit map is uti-
lized to track alterations or updates to portions of
the designated dataset which occur after the initi-
ation of a previous backup copy, such that an
INCREMENTAL copy of only those portions of the
dataset which have been altered may be created at
a subsequent time. Alternately, in the event an
INCREMENTAL copy is to be created, the process
passes from block 103 to block 105, which illus-
trates the changing of the designation of the al-
ternate session bit map to that of the primary
session bit map, and the process then passes to
block 107, which again illustrates the establishment
of an alternate session bit map.

Thus, upon the initiation of a FULL backup
copy, an alternate session bit map is created to
track changes to the dataset which occur after the
initiation of the full copy. Thereafter, if an IN-
CREMENTAL copy is to be created, the previously
established alternate session bit map is utilized as
the primary session bit map and a new alternate
session bit map is created to permit the system to
track changes to the data within the dataset which
occur after the initiation of the INCREMENTAL
copy.
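
The exchange of roles between the two bit maps may
be sketched as follows. The class CopySession and
its methods are hypothetical names introduced for
illustration, and the sketch assumes, for
simplicity, that the designated dataset occupies
every track numbered from zero to total_tracks - 1.

```python
class CopySession:
    """Holds the primary and alternate session bit maps for one dataset."""

    def __init__(self, total_tracks):
        self.total_tracks = total_tracks
        self.primary = [False] * total_tracks
        self.alternate = [False] * total_tracks

    def begin_copy(self, kind):
        """Start a FULL or INCREMENTAL copy; return the tracks to copy."""
        if kind == "FULL":
            # Assumption: a FULL copy covers every track of the dataset.
            self.primary = [True] * self.total_tracks
        elif kind == "INCREMENTAL":
            # Block 105: the alternate bit map becomes the primary bit map.
            self.primary = self.alternate
        else:
            raise ValueError(f"unknown copy type: {kind}")
        # Block 107: a new alternate bit map tracks subsequent updates.
        self.alternate = [False] * self.total_tracks
        return [t for t, on in enumerate(self.primary) if on]

    def record_update(self, track):
        """Note that a track was altered after the current copy began."""
        self.alternate[track] = True
```

In this model an update to track 2 following a FULL
copy causes the next INCREMENTAL copy to include
track 2 only.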

Next, block 109 illustrates a determination of
whether or not an update has occurred. In the
event no update has occurred, the process merely
iterates until such time as an update does occur. In
the event an update has occurred, the process
passes to block 111. Block 111 illustrates a deter-
mination of whether or not the update initiated by
an application within the data processing system is
an update against a portion of the time zero data-
set. If not, the process merely passes to block 113
and the update is processed in a user transparent
fashion. However, in the event the update is against
a portion of the time zero dataset, the process
passes to block 115.

Block 115 illustrates a determination of whether
or not the update is against a copied or uncopied
portion of the time zero dataset. That is, the update
may be directed to a portion of data within the
dataset which has been copied to the backup copy
and is therefore physically complete, or to a portion
which has not yet been copied to the backup copy.
If the portion of the dataset against which the
update is initiated has already been copied to the
backup copy, the process passes to block 117
which illustrates the marking of the alternate
session bit map, to indicate that this portion of the
dataset has been altered since the previous backup
copy was initiated. Thereafter, the process passes
to block 113 which illustrates the processing of the
update. Again, the process then passes from block
113 to block 109, to await the occurrence of the
next update.

Referring again to block 115, in the event the
update against the time zero dataset is initiated
against a portion of the time zero dataset which
has not yet been copied to the backup copy, the
process passes to block 119. Block 119 illustrates
the temporary buffering of the update and the
copying of the affected portion of the time zero
dataset to a sidefile within memory within the stor-
age control unit (see Figure 3). Thereafter, the
process passes to block 121, which illustrates the
marking of the alternate session bit map to indicate
that an update has occurred with respect to this
portion of the dataset since the initiation of the
previous backup copy.

Next, the process passes to block 123, which
illustrates the marking of the primary session bit
map, indicating to the resource manager that this
portion of the dataset has been updated within the
external storage subsystem and that the time zero
copy of this portion of the dataset is now either
within cache memory within storage control unit 65
or within temporary host sidefile 71 which is uti-
lized to prevent overflow of data within the cache
memory within storage control unit 65 (see Figure
3).

After marking the primary session bit map, the
process passes to block 125 which illustrates the
processing of that update. Thereafter, the process
passes to block 127 which depicts a determination
of whether or not the sidefile threshold within the
cache memory of storage control unit 65 has been
exceeded. If so, the process passes to block 129,
which illustrates the generation of an attention sig-
nal, indicating that sidefiles within the storage con-
trol unit are ready to be copied by the processor.
Of course, those skilled in the art will appreciate
that a failure to copy data from the cache memory
within storage control unit 65 may result in the
corruption of the backup copy if that memory is
overwritten. Referring again to block 127, in the
event the sidefile threshold has not been exceeded,
the process returns again to block 109 to await the
occurrence of the next update.
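
The handling of a single update through blocks 111
to 129 of Figure 5 may be sketched as follows. The
sketch is a simplified model and not a definitive
implementation: the name process_update, the layout
of the state dictionary, the three hypothetical
primary bit map states "uncopied", "copied" and
"sidefiled", and the measurement of the sidefile
threshold as a count of tracks are all assumptions.

```python
def process_update(track, data, state, sidefile_threshold):
    """Apply one update; return True if an attention signal is raised."""
    attention = False
    in_dataset = track in state["dataset"]                       # block 111
    if in_dataset and state["primary"][track] == "uncopied":     # block 115
        # Block 119: preserve the time zero image before updating.
        state["sidefile"][track] = state["device"][track]
        state["alternate"].add(track)                            # block 121
        state["primary"][track] = "sidefiled"                    # block 123
        state["device"][track] = data                            # block 125
        # Blocks 127 and 129: signal when the sidefile has grown too large.
        attention = len(state["sidefile"]) > sidefile_threshold
    else:
        if in_dataset:
            state["alternate"].add(track)                        # block 117
        state["device"][track] = data                            # block 113
    return attention
```

A return value of True corresponds to the attention
signal of block 129, upon which the processor would
be expected to drain the sidefile before the cache
memory is overwritten.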

The asynchronous copying of sidefile data 
from a cache memory within storage control unit 65 
to a temporary host sidefile, or to the merged 40 
backup copy, is described in detail within European 
Patent Application No. PCT/EP92/02127, as well as 
the process by which merged copies are created 
which incorporate data read directly from tracked 
cyclic storage unit 61, data within cache memory
within storage control unit 65 and/or data within 
temporary host sidefile 71.

Thus, upon reference to the foregoing, those
skilled in the art will appreciate that by initiating a
time zero backup copy the suspension of application
execution which normally accompanies a backup
copy session is substantially reduced by the
expedient of creating a bit map identifying each
portion of data within the designated dataset to be
updated and thereafter releasing the dataset for
application execution. Portions of the designated
dataset within the external storage subsystem are
then copied on an opportunistic or scheduled basis
and attempted updates to the data contained therein
are deferred temporarily, until such time as the
original data, as it existed as of the time of the
backup copy, may be written to a sidefile for
inclusion within the completed backup copy. Thereafter,
the updates are written to the data within the
external storage subsystem.
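
The effect summarized above, a backup that reflects the data as it
stood at time zero even though updates continue, can be illustrated
with a small simulation. The function name `time_zero_backup`, its
parameters and the schedule of one track copied per step are
assumptions made for this sketch and are not taken from the
described embodiment.

```python
def time_zero_backup(tracks, updates):
    """Simulate an opportunistic copy interleaved with application updates.

    tracks:  list of track contents at time zero (mutated in place by updates)
    updates: dict mapping a copy step to (track, new_data), applied before that step
    Both parameters and the one-track-per-step schedule are assumptions.
    """
    to_copy = set(range(len(tracks)))   # bit map of tracks still to be copied
    sidefile = {}                       # time zero images of tracks updated early
    backup = [None] * len(tracks)

    for step in range(len(tracks)):
        if step in updates:
            track, new_data = updates[step]
            # Preserve the original only if the track is uncopied and unsaved.
            if track in to_copy and track not in sidefile:
                sidefile[track] = tracks[track]
            tracks[track] = new_data    # the update is then written through
        # Copy the next track, preferring the sidefile image when one exists.
        backup[step] = sidefile.pop(step, tracks[step])
        to_copy.discard(step)
    return backup


data = ["A", "B", "C", "D"]
image = time_zero_backup(data, {0: (2, "C2"), 3: (1, "B2")})
```

In this run the update to track 2 arrives before that track is
copied, so its original contents are taken from the sidefile, while
the update to track 1 arrives after that track has been copied and
needs no sidefile. The backup image therefore holds the time zero
contents, and the device holds the updated contents.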

The method and system of the present inven- 
tion may be utilized to create an alternate bit map 
which is automatically established each time an 
update occurs or the system begins reading a 
backup copy. This alternate bit map is then utilized 
to track alterations to the data which occur after
the initial backup copy is created and, at subse- 
quent backup points, this bit map is utilized to 
facilitate the copying of only those portions of the 
designated dataset which have been updated since 
the previous backup copy was created. At the 
initiation of a subsequent INCREMENTAL copy, 
this alternate bit map becomes the primary bit map 
and another alternate bit map is created to track 
alterations or updates which occur to the data after 
the INCREMENTAL copy is initiated. In this man- 
ner, the termination or suspension of application 
execution within a data processing system during 
backup copying is substantially eliminated. For ex- 
ample, those skilled in the art will appreciate that 
sidefiles of affected tracks generated as a result of 
an update prior to physical completion may be 
stored within a cyclic tracked storage device 61 at 
an unusual location, rather than in memory within 
the storage control unit, as depicted in the illus- 
trated embodiment. 
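
The rotation of the alternate and primary bit maps between backup
points can be sketched as follows. The class `IncrementalTracker`
and its method names are hypothetical, and Python sets stand in for
the bit maps; the sketch shows only the bookkeeping, not the
copying of data.

```python
class IncrementalTracker:
    """Hypothetical bookkeeping for the primary and alternate bit maps."""

    def __init__(self, track_count: int):
        # The first backup is a full copy, so every track is in the primary map.
        self.primary = set(range(track_count))
        self.alternate = set()

    def record_update(self, track: int) -> None:
        # Every update after the copy point is noted for the next backup.
        self.alternate.add(track)

    def start_incremental(self) -> set:
        # The alternate map becomes the primary map and a fresh alternate
        # map is created to track updates made after this point.
        self.primary = self.alternate
        self.alternate = set()
        return set(self.primary)


tracker = IncrementalTracker(track_count=8)
tracker.record_update(3)
tracker.record_update(5)
first_incremental = tracker.start_incremental()
tracker.record_update(1)
second_incremental = tracker.start_incremental()
```

Under these assumptions the first incremental copy covers only
tracks 3 and 5, and the second covers only track 1, since each
backup point starts from a freshly created alternate map.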

Claims 

1. A method in a data processing system for 
incremental backup copying of designated 
datasets stored within one or more storage 
subsystems coupled to said data processing 
system during application execution within said 
data processing system, said method compris-
ing the steps of:

suspending application execution within 
said data processing system at a first point in 
time, forming a dataset logical-to-physical stor- 
age system address concordance for said des- 
ignated datasets and resuming application ex- 
ecution thereafter; 

physically backing up said designated 
datasets within said one or more storage sub- 
systems on a scheduled or opportunistic basis 
by copying said designated datasets from said 
one or more storage subsystems to alternate storage
subsystem locations; 

storing an indication of each application 
initiated update to said designated datasets 
which occurs after said first point in time; 
processing at said one or more storage
subsystems any application initiated updates
to uncopied designated datasets by buffering
said updates, writing sidefiles of said desig-
nated datasets or portions thereof affected by
said updates, writing said updates to said one
or more storage subsystems, and copying on a
scheduled or opportunistic basis said sidefiles
to said alternate storage subsystem location in
conjunction with said copied designated data-
sets from said one or more storage sub-
systems in an order defined by said address
concordance; and

creating an incremental backup copy of 
said designated datasets at a designated time
subsequent to said first point in time by copy- 
ing only those designated datasets or portions 
thereof updated after said first point in time. 

2. The method in a data processing system for
incremental backup copying of designated 
datasets stored within one or more storage 
subsystems coupled to said data processing 
system according to Claim 1, wherein said 
step of creating an incremental backup copy of
said designated datasets comprises the steps
of:

forming a second dataset logical-to-phys- 
ical storage system address concordance at 
said designated time for each designated data-
set or portion thereof updated after said first
point in time; 

physically backing up said designated 
datasets updated after said first point in time 
on a scheduled or opportunistic basis by copy-
ing said designated datasets updated after said
first point in time from said one or more stor- 
age subsystems to alternate storage sub- 
system locations; and 

processing at said one or more storage
subsystems any application initiated updates 
to uncopied designated datasets previously up- 
dated after said first point in time by buffering 
said updates, writing sidefiles of said desig-
nated datasets or portions thereof affected by
said updates, writing said updates to said one 
or more storage subsystems, and copying on a 
scheduled or opportunistic basis said sidefiles 
to said alternate storage location in conjunction 
with said copied designated datasets in an
order defined by said second address concor- 
dance. 

3. A method in a data processing system for 
incremental backup copying of designated
datasets stored within one or more tracked 
cyclic storage devices coupled to said data 
processing system during application execu-
tion within said data processing system, said
method comprising the steps of: 

suspending application execution within 
said data processing system at a first point in 
time in response to a request for a backup 
copy of at least one dataset stored within said 
one or more tracked cyclic storage devices; 

forming a dataset and device track concor- 
dance for said at least one dataset and sig- 
naling said data processing system of the 
completion thereof; 

resuming application execution within said 
data processing system in response to said 
completion signal; 

copying said at least one dataset from said 
one or more tracked cyclic storage devices on 
a scheduled or opportunistic basis to an al- 
ternate storage subsystem; 

storing an indication of each application 
initiated update to any portion of said at least 
one dataset which occurs after said first point 
in time; 

processing application initiated updates to 
uncopied portions of said at least one dataset 
by buffering said updates, writing sidefiles of 
said affected portions of said at least one data- 
set, writing said updates to said one or more 
tracked cyclic storage devices and copying 
said sidefiles to said alternate storage location; 
and 

creating an incremental backup copy of 
said at least one dataset at a designated time 
subsequent to said first point in time by copy- 
ing to said alternate storage system location 
only those portions of said at least one dataset 
which have been updated after said first point 
in time. 

4. A data processing system for performing in- 
cremental backup copying of designated data- 
sets stored within one or more storage sub- 
systems coupled to said data processing sub- 
system during application execution within said 
data processing system, said data processing 
system comprising: 

means for suspending application execu- 
tion within said data processing system at a 
first point in time; 

means for forming a dataset logical-to- 
physical storage system address concordance 
for said designated datasets at said first point 
in time; 

means for resuming application execution 
thereafter; 

means for physically backing up said des- 
ignated datasets within said one or more stor- 
age subsystems on a scheduled or opportunis- 
tic basis by copying said designated datasets 
from said one or more storage subsystems to alternate
storage subsystem locations; 

means for storing an indication of each 
application initiated update to said designated 
datasets which occurs after said first point in
time; 

means for processing at said one or more
storage subsystems any application initiated 
updates to uncopied designated datasets by 
buffering said updates, writing sidefiles of said
designated datasets or portions thereof affect- 
ed by said updates, writing said updates to 
said one or more storage subsystems, and 
copying on a scheduled or opportunistic basis 
said sidefiles to said alternate storage sub-
system location in conjunction with said copied
designated datasets from said one or more stor-
age subsystems in an order defined by said
address concordance; and 

means for creating an incremental backup
copy of said designated datasets at a des- 
ignated time subsequent to said first point in 
time by copying only those designated data- 
sets or portions thereof updated after said first 
point in time.

5. A storage control unit having a cache memory 
for permitting incremental backup copying of 
designated datasets stored within a storage 
subsystem associated therewith by a data pro-
cessing system coupled thereto, said storage
control unit comprising: 

means for forming a dataset logical-to- 
physical storage address concordance for said 
designated datasets within said storage sub-
system at a first point in time;

means for permitting copying of said des- 
ignated datasets within said stored designated 
datasets on a scheduled or opportunistic basis 
by said data processing system;

means for storing an indication of each 
update to a portion of said designated datasets 
which occurs after said first point in time; 

means for processing updates to uncopied 
portions of said designated datasets by buf-
fering said updates, writing sidefiles of said
uncopied portions of said designated datasets 
affected by said updates within said cache 
memory and writing said updates into said 
associated storage subsystems;

means for permitting copying of said 
sidefiles by said data processing system; and 

means for permitting selective copying at 
a designated time after said first point in time 
of said portions of said designated datasets
updated after said first point in time. 